From Vocoder to Vocalock: Speech Recognition Machines Still Have a Long Way to Go

Not recorded

Abstract

Ever since computers were first built, people have been fascinated with the idea of someday talking with them. Such a capability has long been a staple of science fiction. For example, Arthur C. Clarke’s 2001: A Space Odyssey features casual conversation between the crew of a spaceship and their computer, Hal.1 This sort of easy idiomatic communication between humans and machines will continue to exist strictly within the realm of science fiction for many years to come. But there are simpler machines that can respond appropriately to a few voice commands. Some of them are at work on production lines.2

Back in the 1940s the Vocoder, developed at Bell Laboratories, was thought to be the key to the voice-activated typewriter.3 Now futurists predict that the “understanding typewriter” is not far away. Imagine your office equipped with an understanding typewriter. To write your colleague, you simply dictate into a microphone connected to the typewriter. The machine instantly types out your critique of his last paper, corrects your grammar, and eliminates the “ahs” and “uhs.”

Speech recognition devices are not to be confused with optical character recognition (OCR) machines, which I have discussed previously.4 The Kurzweil reader is an example of an OCR. It can read aloud from a printed page in a synthesized electronic voice. I might add, however, that in spite of its other successes, the Kurzweil OCR reader cannot yet be adapted to ISI®’s data input needs. Speech recognition systems must accept voice commands instead of printed characters as an input, and they must correctly identify each word. I noted that there were still significant problems with OCR technology. But as we shall see, the problems associated with speech recognition are far greater. Both computers and humans find it easier to talk than to listen.

I recall reading about the Vocoder in a collection of essays entitled Bibliography in an Age of Science.5 Then in the early 1960s, the Sperry Gyroscope Company invented the Sceptron, a device that identifies sound waves by their frequency content. ISI’s Irv Sher developed an application. In 1965, he patented a door lock that would open only in response to an individual’s voice.6 The door mechanism was called Vocalock. To operate Vocalock, you first pushed a button to activate the system, and then spoke into a microphone in the door. The system analyzed the sound and opened the lock if it recognized the voice. Vocalock could be programmed to recognize any number of individual voices. Interestingly enough, Robert Heinlein in his 1961 book Stranger in a Strange Land described a future in which voice-operated locks are commonplace.7

It should be noted at this point that the term “speech recognition” can apply to several types of machines. The Vocalock, for example, was a device

Related resources

Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...

Simulations of High-Frequency Vocoder on Mandarin Speech Recognition for Acoustic Hearing Preserved Cochlear Implant

Vocoder simulations are generally adopted to simulate the electrical hearing induced by the cochlear implant (CI). Our research group is developing a new four-electrode CI microsystem which induces high-frequency electrical hearing while preserving low-frequency acoustic hearing. To simulate the functionality of this CI, a previously developed hearing-impaired (HI) hearing model is combined wit...

Speech Processing

The processing of speech signals has a long and venerable history. As early as 1770 Wolfgang von Kempelen demonstrated his mechanical talking machine to the courts of Europe. In 1928 Homer Dudley invented the “vocoder” (voice coder) arguing that speech is specified by a few slowly varying parameters requiring only a fraction of the telephone bandwidth for transmission. A digital vocoder was fir...

Towards Continuous Online Learning Based Cognitive Speech Processing

Despite the substantial progress of the speech processing technology, it is generally acknowledged that we have a long way to go before developing ASR systems which exhibit performance approaching that of humans. Many researchers believe that simply extending our current theories and practical solutions may never lead us to that goal. One promising research direction is development of learning ...

A Comparative Study of Gender and Age Classification in Speech Signals

Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...

Publication date: 1998
